Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(storagenode): restore uncommitted logs #492

Merged
merged 1 commit into from
Jul 28, 2023
Merged

Conversation

ijsong
Copy link
Member

@ijsong ijsong commented Jun 22, 2023

What this PR does

Currently, all log stream replicas belonging to the storage nodes that are just
restarted can't commit logs written before restarting if all log stream
replicas in a log stream were restarted simultaneously. They have logs
uncommitted in their storages, but they can't process Commit RPC sent from the
metadata repository.

This PR fixes the above issue. While recovering the log stream context after
restarting the storage nodes, it restores uncommitted logs.

Which issue(s) this PR resolves

Resolves #490

@ijsong
Copy link
Member Author

ijsong commented Jun 22, 2023

@ijsong ijsong self-assigned this Jun 22, 2023
@ijsong ijsong linked an issue Jun 22, 2023 that may be closed by this pull request
@ijsong ijsong force-pushed the restore_uncommitted_logs branch from 0f85da0 to aee027a Compare June 22, 2023 14:20
@ijsong ijsong force-pushed the unskip_TestExecutorRestore branch from 11968c6 to 50212da Compare June 27, 2023 01:21
@ijsong ijsong force-pushed the restore_uncommitted_logs branch from aee027a to 3c767bf Compare June 27, 2023 01:21
@ijsong ijsong force-pushed the unskip_TestExecutorRestore branch from 50212da to e1a5841 Compare June 27, 2023 03:24
@ijsong ijsong force-pushed the restore_uncommitted_logs branch from 3c767bf to 9e8209f Compare June 27, 2023 03:24
@ijsong ijsong force-pushed the unskip_TestExecutorRestore branch from e1a5841 to 9cdc574 Compare July 4, 2023 06:45
@ijsong ijsong force-pushed the restore_uncommitted_logs branch from 9e8209f to aee1b52 Compare July 4, 2023 06:45
@ijsong ijsong force-pushed the unskip_TestExecutorRestore branch from 9cdc574 to 3957d42 Compare July 17, 2023 00:59
@ijsong ijsong force-pushed the restore_uncommitted_logs branch from aee1b52 to 6f87f44 Compare July 17, 2023 00:59
Currently, all log stream replicas belonging to the storage nodes that are just restarted can't
commit logs written before restarting if all log stream replicas in a log stream were restarted
simultaneously. They have logs uncommitted in their storages, but they can't process Commit RPC sent
from the metadata repository.

This PR fixes the above issue. While recovering the log stream context after restarting the storage
nodes, it restores uncommitted logs.

Resolves #490
@ijsong ijsong force-pushed the unskip_TestExecutorRestore branch from 3957d42 to 43b59f4 Compare July 28, 2023 08:40
@ijsong ijsong force-pushed the restore_uncommitted_logs branch from 6f87f44 to 267cccc Compare July 28, 2023 08:40
@ijsong
Copy link
Member Author

ijsong commented Jul 28, 2023

@ijsong started a stack merge that includes this pull request via Graphite.

Base automatically changed from unskip_TestExecutorRestore to main July 28, 2023 09:54
@ijsong ijsong merged commit a9832ee into main Jul 28, 2023
@ijsong
Copy link
Member Author

ijsong commented Jul 28, 2023

@ijsong merged this pull request with Graphite.

@ijsong ijsong deleted the restore_uncommitted_logs branch July 28, 2023 09:54
ijsong added a commit that referenced this pull request Aug 7, 2023
🤖 I have created a release *beep* *boop*
---


## [0.15.0](v0.14.1...v0.15.0) (2023-07-31)


### Features

* **admin:** add otelgrpc metric interceptor ([d9ca9aa](d9ca9aa))
* **admin:** add otelgrpc metric interceptor ([#509](#509)) ([db7a1a2](db7a1a2))
* **admin:** speed up fetching cluster metadata ([3e46f62](3e46f62))
* **admin:** speed up fetching cluster metadata ([#480](#480)) ([53a8f19](53a8f19))
* **all:** add common flags for telemetry ([fcacd1a](fcacd1a))
* **all:** add common flags for telemetry ([#494](#494)) ([63355e9](63355e9))
* **benchmark:** share a connection between appenders in a target ([7dc53e9](7dc53e9))
* **benchmark:** share a connection between appenders in a target ([#524](#524)) ([2cd9196](2cd9196))
* **client:** add Clear to the log stream appender manager ([9a89065](9a89065))
* **client:** add Clear to the log stream appender manager ([#514](#514)) ([e5b6a2e](e5b6a2e))
* **storagenode:** add --storage-trim-delay to set a delay before the deletion of log entries ([db39713](db39713))
* **storagenode:** add --storage-trim-delay to set a delay before the deletion of log entries ([#529](#529)) ([015bfa4](015bfa4))
* **storagenode:** add --storage-trim-rate to set throttling rate of Trim ([83b7496](83b7496))
* **storagenode:** add --storage-trim-rate to set throttling rate of Trim ([#530](#530)) ([6e69306](6e69306))
* **telemetry:** customize bucket size of process.runtime.go.gc.pause_ns ([b181132](b181132))
* **telemetry:** customize bucket size of process.runtime.go.gc.pause_ns ([#510](#510)) ([9d99520](9d99520))
* **telemetry:** customize bucket size of rpc.server.duration ([a0e5973](a0e5973))
* **telemetry:** customize bucket size of rpc.server.duration ([#511](#511)) ([e41fe1c](e41fe1c))


### Bug Fixes

* **benchmark:** make append duration's precision high ([e3a091d](e3a091d))
* **benchmark:** make append duration's precision high ([#522](#522)) ([815af53](815af53))
* **benchmark:** support graceful stop ([8616d55](8616d55))
* **benchmark:** support graceful stop ([#527](#527)) ([fc4ed81](fc4ed81))
* **metarepos:** add TestMRIgnoreDirtyReport ([fe2a550](fe2a550))
* **metarepos:** allow set commitTick ([bdca20a](bdca20a))
* **metarepos:** ignore invalid report ([e8620de](e8620de))
* **storagenode:** ignore context error while checking to interleave of Append RPC errors ([04d1052](04d1052))
* **storagenode:** ignore context error while checking to interleave of Append RPC errors ([#504](#504)) ([5a7a3b0](5a7a3b0))
* **storagenode:** restore uncommitted logs ([267cccc](267cccc)), closes [#490](#490)
* **storagenode:** restore uncommitted logs ([#492](#492)) ([a9832ee](a9832ee)), closes [#490](#490)


### Performance Improvements

* **admin:** use singleflight to handle Admin's RPCs ([c231888](c231888))
* **admin:** use singleflight to handle Admin's RPCs ([#482](#482)) ([1a6a96d](1a6a96d))
* **metarepos:** add a pool for []*mrpb.Report ([fa8c89d](fa8c89d))
* **metarepos:** add a pool for []*mrpb.Report ([#534](#534)) ([16b2181](16b2181))
* **metarepos:** add a pool for *mrpb.RaftEntry ([be9f121](be9f121))
* **metarepos:** add a pool for *mrpb.RaftEntry ([#536](#536)) ([96ab5e2](96ab5e2))
* **metarepos:** add a pool for mrpb.Reports ([59a6a5a](59a6a5a))
* **metarepos:** add a pool for mrpb.Reports ([#533](#533)) ([b227c75](b227c75))
* **metarepos:** avoid copy overhead by removing unnecessary converting from byte slice to string ([a775628](a775628))
* **metarepos:** avoid copy overhead by removing unnecessary converting from byte slice to string ([#532](#532)) ([1702769](1702769))
* **metarepos:** reuse mrpb.StorageNodeUncommitReport while changed ([57d8039](57d8039))
* **metarepos:** reuse mrpb.StorageNodeUncommitReport while changed ([#537](#537)) ([8f6e097](8f6e097))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

storagenode: can't commit written logs after restarting
2 participants